After record-setting enforcement quarters in 2021, Meta removed or flagged 16 million pieces of content containing hate speech on Facebook and Instagram between January and March of 2024, according to its most recent Transparency Report, a figure roughly steady compared to the previous year and quarter. The prevalence rate, i.e. the share of viewed content containing hate speech, has reportedly decreased to between 0.01 and 0.02 percent on Facebook since 2020, while it increased to 0.03 percent on Instagram during the first months of 2024. In other words, one or two out of every 10,000 pieces of content on Facebook, and three out of every 10,000 on Instagram, slipped past Meta's flagging and deletion processes.
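The conversion from a prevalence percentage to a per-10,000 figure is simple arithmetic; a minimal sketch (the function name is illustrative, not part of Meta's reporting):

```python
def prevalence_per_10k(prevalence_percent: float) -> float:
    """Convert a prevalence rate given in percent to pieces of content
    per 10,000 views (1 percent = 100 per 10,000)."""
    return prevalence_percent * 100

# Facebook, 0.01 to 0.02 percent: about 1 to 2 pieces per 10,000 views
facebook_low = prevalence_per_10k(0.01)
facebook_high = prevalence_per_10k(0.02)

# Instagram, 0.03 percent: about 3 pieces per 10,000 views
instagram = prevalence_per_10k(0.03)
```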
This can partly be attributed to improvements in Meta's AI moderation algorithms. Still, the share of actioned hate speech content surfaced by user reports rather than automated detection stood at about five percent, on par with the previous two quarters. Relying heavily on algorithms has its downsides, though: in Q1 2024, 148,000 pieces of content removed for hate speech were later restored, 145,000 of them through automatic processes that did not require a manual appeal.
The ratio of content acted upon to content later reinstated was especially striking between April and June 2023 on both platforms. Of the 18 million pieces of content flagged or removed on Facebook, 11 percent were reported by users. One third of these pieces were reinstated automatically, while 923,000 posts, comments, or videos were put back online after an appeal against the takedown was reviewed. On Instagram, almost 40 percent of the content taken down in the same period was later restored. Meta gives no specific reason for this outlier.
Meta publishes its Community Standards Enforcement Report every quarter to create more transparency around its moderation measures. Since reporting began, the total number of flagged or removed pieces of content containing hate speech, as well as the proactive action rate, reached a historic high in the second quarter of 2021, dropped steadily until the first quarter of 2023, and then briefly rose back to 2021 levels. The company's Community Standards define hate speech as “direct attacks against people — rather than concepts or institutions — on the basis of what we call protected characteristics (PCs): race, ethnicity, national origin, disability, religious affiliation, caste, sexual orientation, sex, gender identity, and serious disease.”